Read Binary
(Operator Toolbox)
Synopsis
This operator reads packed binary data (records/structs) into data tables.
Description
The operator can be used to read any sequence of fixed-length records. It supports common numerical types as well as embedded strings.
The format of a single record must be specified by the user. For example, the format string 3i4x describes a record consisting of 3 signed 32Bit integers followed by 4 padding bytes. The string 4f16s describes a record consisting of 4 32Bit floats followed by an embedded string that is 16 bytes long.
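The two record layouts above can be tried out outside of the operator. The following is a minimal sketch that assumes Python's struct module, whose format strings happen to use a similar notation for these characters. It is not the operator's implementation, and the "<" prefix (little endian, no alignment) is a Python convention rather than part of the operator's format string.

```python
import struct

# "<" selects little-endian byte order with standard sizes and no implicit
# alignment. It is required by Python here, not by the operator.
INTS_AND_PADDING = struct.Struct("<3i4x")    # 3 signed 32-bit ints + 4 padding bytes
FLOATS_AND_STRING = struct.Struct("<4f16s")  # 4 32-bit floats + one 16-byte string

# Build one record of each kind and decode it again.
int_record = INTS_AND_PADDING.pack(1, -2, 3)
values = INTS_AND_PADDING.unpack(int_record)

float_record = FLOATS_AND_STRING.pack(1.0, 2.0, 3.0, 4.0, b"hello")
*floats, raw_name = FLOATS_AND_STRING.unpack(float_record)

print(INTS_AND_PADDING.size, values)              # record length in bytes, decoded ints
print(FLOATS_AND_STRING.size, floats, raw_name)   # the string is padded with null bytes
```

The padding bytes contribute to the record length but produce no value, which mirrors how x is not mapped to any column.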
Input
file
The input file containing packed binary data.
Output
data
The output data table.
Parameters
- file The input file containing packed binary data.
- format
The format string describing a single record.
The characters b, h, i, and l denote 8Bit, 16Bit, 32Bit, and 64Bit signed integers respectively. The uppercase characters B, H, I, and L denote their unsigned counterparts.
The characters f and d denote 32Bit and 64Bit IEEE floating point numbers.
The character s denotes an embedded string that may be null-terminated.
The character x denotes a padding byte that will not be mapped to any column in the resulting data table.
All characters can be combined with an optional quantifier. For example, 4f denotes a sequence of 4 32Bit floating point numbers and is equivalent to the expression ffff. One exception is the handling of strings. The expression 16s denotes a single string of length 16 bytes (instead of 16 strings of length 1).
- offset Bytes to skip at the beginning of the file. This parameter can be used to skip magic bytes or file headers that are not part of the actual data set.
- byte order The byte order (endianness) of the input data. Little-endian representations store the least-significant byte first, big-endian representations store the most-significant byte first.
- encoding The encoding of embedded strings (if any).
- encoding Whether to trim embedded strings at the first null character (if any).
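How the parameters interact can be illustrated with a short sketch. The function below is hypothetical and is not the operator's code: it assumes Python's struct module, and the names read_records, little_endian, and trim_at_null are chosen for illustration only. Because Python uses q and Q for 64Bit integers when standard sizes are selected, the sketch translates the characters l and L accordingly.

```python
import struct

# Hypothetical translation table: l/L denote 64-bit integers in the format
# string described above, which Python's struct module spells q/Q.
_TO_STRUCT = str.maketrans({"l": "q", "L": "Q"})


def read_records(data, fmt, offset=0, little_endian=True,
                 encoding="utf-8", trim_at_null=True):
    """Decode fixed-length records from a bytes object into a list of rows."""
    prefix = "<" if little_endian else ">"
    layout = struct.Struct(prefix + fmt.translate(_TO_STRUCT))
    payload = data[offset:]  # skip magic bytes or a file header
    # Assumption of this sketch: a trailing partial record is ignored.
    usable = len(payload) - len(payload) % layout.size
    rows = []
    for fields in layout.iter_unpack(payload[:usable]):
        row = []
        for value in fields:
            if isinstance(value, bytes):  # an embedded string (s)
                if trim_at_null:
                    value = value.split(b"\x00", 1)[0]
                value = value.decode(encoding)
            row.append(value)
        rows.append(row)
    return rows


# Usage: a 5-byte header followed by two records of format 2h4s.
header = b"MAGIC"
body = struct.pack("<2h4s", 1, -2, b"ab") + struct.pack("<2h4s", 3, 4, b"cdef")
rows = read_records(header + body, "2h4s", offset=len(header))
print(rows)
```

Each row corresponds to one record, and each value to one column. Padding bytes (x) yield no value, and a quantified string such as 4s yields a single value.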
Tutorial Processes
Reading numbers and strings from a binary file
In this tutorial we use the Read Binary operator to read both numbers and strings from a binary input file.
Please note that in this example, we create the binary file with the Create Document operator and use human-readable characters only. This allows inspecting the file contents in the workflow editor. In a real-world scenario, however, the file would likely contain data that cannot be displayed as text.
The example reads both 32Bit floating point numbers and 8Bit integers from the input file. An example of the former is the character sequence 'YuCA', which is stored as the bytes 59 75 43 41 (hexadecimal). Read as a little-endian 32Bit value, this is '0x41437559', which corresponds to the floating point number '12.216' (rounded).
The unsigned 8Bit integers are simply the ASCII code points of the characters. The example strings are read without further conversions into the output table.
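The values mentioned in this tutorial can be checked independently of the operator. The following sketch assumes Python's struct module and interprets the same four bytes in three different ways.

```python
import struct

raw = b"YuCA"

# Interpreted as one little-endian 32-bit float.
(as_float,) = struct.unpack("<f", raw)

# Interpreted as four unsigned 8-bit integers (the ASCII code points).
as_bytes = struct.unpack("<4B", raw)

# The same four bytes viewed as a little-endian unsigned 32-bit integer.
(as_uint32,) = struct.unpack("<I", raw)

print(round(as_float, 3))  # 12.216
print(as_bytes)            # (89, 117, 67, 65)
print(hex(as_uint32))      # 0x41437559
```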